Nature Microbiology
Springer Science and Business Media LLC
Preprints posted in the last 7 days, ranked by how well they match Nature Microbiology's content profile, based on 133 papers previously published here. The average preprint has a 0.15% match score for this journal, so anything above that is already an above-average fit.
de Hesselle, H. C.; Garben, B.-F.; Stark, K. J.; Warth, R.; Teumer, A.; Pattaro, C.; Heid, I. M.; Winkler, T. W.
Chronic kidney disease is characterized by decreased glomerular filtration rate (eGFR, estimated from serum creatinine or cystatin C) or increased urinary albumin-to-creatinine-ratio (UACR). Genome-wide association studies provided the genetic make-up of these traits, but their overlap remained largely unknown. Our multi-trait GWAS (N=1M) identified 812 signals and multi-trait fine-mapping sharpened the identification of likely causal variants. Of 333 signals classified for filtration function or albuminuria, only 11 overlapped. Their effects on eGFR and UACR were directionally concordant, dominated by eGFR and independent of HbA1c or mean arterial pressure. Mapped genes pinpointed mechanisms related to glomerular filtration area (SHROOM3, EPB41L5) and sodium-mediated intraglomerular pressure (NRBP1, DPEP1/CHMP1A). Genetics of fluid intake resulted in shadow effects on UACR without albumin leakage into urine. Our multi-trait approach sharpened the identification of likely causal genes for kidney traits, demonstrated largely distinct genetics for filtration function versus albuminuria, and provided new biological insights into the overlap.
Sajib, M. S.; Tanmoy, A. M.; Kanon, N.; Jui, A. B.; Islam, M. S.; Dola, N. Z.; Hossain, M. M.; Mobarak, R.; Shahidullah, M.; Hoque, M.; Ahmed, A. N. U.; Holmes, A. H.; Saha, S. K.; Saha, S.; Wan, Y.; Hooda, Y.
Background Healthcare-associated infections pose a major burden to neonatal health worldwide and remain difficult to track in low-resource hospitals because patient movement data and pathogen genomic data are rarely integrated into actionable transmission models. Existing approaches are often restricted to specific settings, highly structured electronic health records (EHRs), or analyses focused on either patient movements or pathogen characteristics alone. To address this gap, we developed PathoPath, an open-source integrative modelling platform, and evaluated its utility in a high burden paediatric hospital in Dhaka, Bangladesh. Methods PathoPath is an open-source R package that combines electronic health records with whole genome sequencing data to generate contact networks from direct and indirect contacts using minimal structured inputs. We retrospectively applied PathoPath to 373 cases of Klebsiella pneumoniae species complex (KpSC) infection identified in 2021 at the largest paediatric referral hospital in Dhaka, Bangladesh. Ward level patient movement trajectories were used to reconstruct contact networks, and genomic data from isolates from children <60 days were integrated to identify probable dissemination of bacterial clones and antimicrobial resistance plasmids. Findings PathoPath identified 750 direct contacts among 317 patients, forming 25 connected components, with the largest including 93 patients. KpSC infections were identified across 21 of 37 wards, with the neonatal intensive care unit accounting for 77.9% of all cases. Integration of genomic and network data distinguished sustained clustering of ST147 from multiple probable inter-clonal dissemination events involving IncFII plasmids carrying blaNDM-5 and/or blaOXA-181 within ST16. Four dominant sequence types accounted for 65.6% of sequenced isolates, and carbapenemase genes were detected in 95.8%. 
Interpretation PathoPath reconstructs hospital-wide contact networks and integrates them with pathogen genomics to map probable dissemination of pathogens and antimicrobial resistance using minimal structured clinical data. It could support more targeted infection prevention and control in hospitals where granular digital records are not available.
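The contact-network step described in the Methods can be illustrated with a small sketch. This is not PathoPath's API (PathoPath is an R package); it is a minimal Python illustration that assumes a direct contact means two patients' stays in the same ward overlap in time, and it does not cover indirect contacts or genomic data. The record layout, function names, and sample stays are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: (patient_id, ward, admit_day, discharge_day) ward stays.
stays = [
    ("P1", "NICU", 1, 5),
    ("P2", "NICU", 4, 9),
    ("P3", "NICU", 10, 12),
    ("P3", "Ward7", 13, 15),
    ("P4", "Ward7", 14, 20),
    ("P5", "Ward2", 1, 3),
]

def direct_contacts(stays):
    """Return pairs of patients whose stays in the same ward overlap in time."""
    by_ward = defaultdict(list)
    for patient, ward, start, end in stays:
        by_ward[ward].append((patient, start, end))
    contacts = set()
    for ward_stays in by_ward.values():
        for (p1, s1, e1), (p2, s2, e2) in combinations(ward_stays, 2):
            # Two closed intervals overlap when each starts before the other ends.
            if p1 != p2 and s1 <= e2 and s2 <= e1:
                contacts.add(tuple(sorted((p1, p2))))
    return contacts

def connected_components(patients, contacts):
    """Group patients into connected components with a union-find structure."""
    parent = {p: p for p in patients}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for a, b in contacts:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for p in patients:
        groups[find(p)].add(p)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])

patients = sorted({s[0] for s in stays})
contacts = direct_contacts(stays)
components = connected_components(patients, contacts)
```

On the sample stays, P1 and P2 overlap in the NICU and P3 and P4 overlap in Ward7, so the five patients fall into three components, one of them a single unconnected patient.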
BEAVOGUI, A. H.; Doumbia, S.; Kieh, M.; Leigh, B.; Sow, S.; Lhomme, E.; Ben-Farhat, S.; Dubois Cauwelaert, N.; Roy, C.; Diouf, W.; Idrissa, S.; Diarra, S.; Millimouno, N. P.; Diallo, F. A.; Kamara, M.; Pratt, D.; Dicko, I.; Kennedy, S. B.; Esperou, H.; Choi, E. M.; Kpetigo, A.-M. D.; D'Ortenzio, E.; Diallo, A.; Lancrey-javal, S.; Hamze, B.; Schwimmer, C.; Wiedemann, A.; Ayouba, A.; Peeters, M.; Lane, H. C.; Higgs, E.; Watson-Jones, D.; Yazdanpanah, Y.; Greenwood, B.; RICHERT, L.; Levy, Y.; PREVAC study team
Background: The World Health Organization has expanded its recommendations for prophylactic Ebola vaccination for at-risk populations. Durable vaccine-induced immunity is important for sustaining outbreak preparedness in regions with recurrent Ebola virus disease (EVD). We assessed five-year persistence of vaccine-induced immune responses in adults and children from the PREVAC trial. Methods: Two large randomised phase 2 trials (NCT02876328), in adults and children aged ≥1 year, were conducted in four west African countries. Participants were randomly assigned to placebo or to one of three Ebola vaccine strategies: Ad26.ZEBOV followed by MVA-BN-Filo at 56 days; rVSVΔG-ZEBOV-GP followed by placebo; or rVSVΔG-ZEBOV-GP followed by a homologous booster dose at 56 days. After 12 months of follow-up, the primary results were published, participants were unblinded to their vaccine assignment, and follow-up continued for 60 months. After Month 24, placebo group recipients were offered active vaccination. Anti-Ebola virus glycoprotein immunoglobulin G (IgG) concentrations were measured for 5 years. Findings: 1401 adults and 1401 children were initially randomized, and 1315 (93.9%) adults and 1322 (94.4%) children attended at least one long-term visit. Retention was high, with 95% followed beyond 1 year and 83% completing the 5-year follow-up. For the three vaccine strategies, antibody geometric mean concentrations (GMC) declined modestly between Months 12 and 24, followed by a stable plateau from Months 24 to 60. At Month 60, antibody GMC were higher in the rVSV-based groups (1099 and 1216 EU/ml for adults; 1982 and 2347 EU/ml for children) than in the Ad26.ZEBOV, MVA-BN-Filo group (252 EU/ml for adults and 645 EU/ml for children). Antibody persistence at Month 60 was heterogeneous, varying by age, sex, country, and baseline IgG concentration. Interpretation: Licensed Ebola vaccines induced sustained antibody responses in adults and children for up to 5 years.
While the protective antibody level is unknown, these data demonstrate long-lasting immune responses from currently employed vaccine strategies.
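For readers unfamiliar with the summary statistic used in the Findings, a geometric mean concentration is the exponential of the mean of the log-transformed concentrations. The sketch below only illustrates that definition; the function name and sample values are hypothetical and are not trial data, and the trial's own analysis may differ (for example, in how values below the assay limit are handled).

```python
import math

def geometric_mean_concentration(values):
    """Geometric mean of positive concentrations: exp(mean(log(x)))."""
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("concentrations must be positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical IgG concentrations in EU/ml, not trial data.
example = [100.0, 1000.0, 10000.0]
gmc = geometric_mean_concentration(example)
```

Because the calculation works on the log scale, one very high responder pulls the GMC up far less than it would pull up an arithmetic mean.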
Colosi, E.; Calmon, L.; Fässli, M.; Koch, K.; Bielicki, J. A.; Colizza, V.
Pooled testing programs were introduced during the COVID-19 pandemic to expand surveillance capacity while preserving testing resources, but evidence on their epidemiological impact in schools under real-world conditions remains limited. We analyzed data from the pooled testing program implemented in public primary schools of the canton of Basel-Landschaft, Switzerland, during the Fall-Winter 2021 Delta wave. We used an agent-based transmission model informed by pooled and individual testing results, school characteristics, contact networks, and community incidence. The model was fitted to pooled positivity ratios in four clusters of administrative areas with similar epidemic trajectories. We compared pooled testing with alternative protocols in terms of school transmission, testing volume, and student-days lost. During the study period, pooled testing was offered to 21'187 students across 62 public primary schools, with high and stable participation across clusters (mean 71-79%). The fitted model reproduced observed pool positivity trends well. Compared with pooled testing, reactive class closure, reactive screening, and symptomatic testing were associated with higher in-school transmission, with excess ranging from 50% to 87%, 63% to 104%, and 72% to 133% across clusters. Weekly individual screening achieved similar reductions in transmission but required 15-25 times more tests. Relaxing class closure after depooling substantially reduced student-days lost without increasing transmission. Under real-world conditions, pooled testing provided an effective and resource-efficient strategy to reduce SARS-CoV-2 transmission in primary schools. Combining early detection of asymptomatic infections with low testing demands, pooled testing offers a scalable approach to school surveillance and control for pandemic response in educational settings.
Wilks, A.; Lofters, J.; Lee, J.; Milton-Hicks, J.; Klings, E.; Steinberg, M.
Fetal hemoglobin (HbF) prevents the polymerization of sickle hemoglobin (HbS). HbF, usually measured as a percent of total hemoglobin (%HbF), is inversely associated with the severity of sickle cell disease (SCD) but fails to capture the distribution of HbF concentrations within red blood cells (RBCs). The relative proportion of HbF and HbS within an RBC is reflected by the HbF:HbS ratio, whereas HbF/F-cell quantifies the absolute amount of HbF per RBC. While correlated, HbF:HbS ratio and HbF/F-cell are not interchangeable. In the context of mean corpuscular hemoglobin (MCH), HbF/F-cell approximates whether sufficient HbF is present to inhibit HbS polymerization. We examined the association of mean HbF/F-cell with sub-phenotypes of sickle cell disease in three independent cohorts. Both %HbF and HbF/F-cell were significantly associated with multiple clinical and laboratory features of SCD; however, HbF/F-cell demonstrated stronger associations with clinical severity measures across cohorts. Higher HbF/F-cell was associated with fewer clinical events, reduced hemolysis, and lower mortality. Changes in HbF/F-cell after hydroxyurea treatment were associated with ~11-13% reduction in acute events in patients with <1 pg increase and >60% reduction with a >5 pg increase in HbF/F-cell. For each pg increase in HbF/F-cell there was ~6% reduction in the rate of acute events. As a surrogate for the distribution of HbF concentrations among F-cells, HbF/F-cell adds physiologically relevant insights that could guide prognosis and treatment.
Naing, L.; de Mattos Barbosa, M. G.; Connell, I. P.; Chicca, J.; Zhao, Z.; Reister, N. A.; Bruchez, A.; Greenspan, N.; McComsey, G.; Platt, J. L.; Cascalho, M.
Acute respiratory distress syndrome (ARDS) is a devastating complication of respiratory infections; however, the biological mechanisms that initiate its onset are poorly defined. Here we show that TNFRSF13B polymorphisms increase the risk of ARDS following SARS-CoV-2 infection up to 7.4-fold compared to the WT genotype. The increased risk was not due to immune-deficiency or impaired virus neutralization. On the contrary, TNFRSF13B mutant subjects mounted better antibody neutralization compared to subjects with WT TNFRSF13B. However, IgG from subjects expressing TNFRSF13B variants had less sialic acid, terminal galactose, and fucose than IgG from subjects with a WT genotype. Moreover, IgG from TNFRSF13B mutant subjects exhibited increased recruitment of complement factors. Thus, besides well-known actions governing plasma cell differentiation, TNFRSF13B impacts both affinity maturation and effector functions of IgG in ways that independently govern complement activation controlling inflammatory responses known to trigger ARDS.
Fu, B.; DeSchepper, L. B.; Sun, J.; McKeithen-Mead, S. A.; Kapili, B.; Ochoa-Andersen, P.; Spencer, S. P.; Fardeen, T.; Ricardo, M.; El Kamari, V.; Sinha, S.; Relman, D. A.; Grembi, J. A.; Shalon, D.; Estrela, S.; Huang, K. C.
The human small intestine (SI) plays a central role in nutrient processing, host-microbe interactions, and immune regulation, yet remains poorly characterized due to the lack of minimally disruptive sampling methods. Here, we present a protocol for deploying, recovering, and analyzing samples collected using an ingestible device that enables multi-region, lumen-targeted SI sampling during normal digestion. The device incorporates a ~30-cm collapsible tube wound into pH- or time-responsive layers that sequentially unfurl in situ, typically capturing three spatially ordered samples with high yield and reliable retrieval. This protocol outlines study design, participant handling, device recovery, contamination control, and standardized workflows for analyses, including cell quantification, culturomics, sequencing, and metabolomics. We further describe benchmarking approaches for evaluating spatial resolution and strategies for assay prioritization when sample volume is limiting. By reducing participant burden and facilitating integration with stool, saliva, and clinical metadata, this approach enables longitudinal and large-cohort studies linking SI microbial ecology and host physiology to human health.
Cai, L.; DeBerardinis, R. J.
Heterozygous carriers of autosomal recessive disease variants are conventionally considered unaffected, yet population-scale genomic datasets reveal subclinical carrier phenotypes. MMACHC encodes a cobalamin-processing protein whose biallelic loss causes cobalamin C deficiency, an inborn error of intracellular cobalamin metabolism. We performed an unbiased quantitative phenome-wide association screen in All of Us Research Program v8 to identify phenotypes associated with rare heterozygous MMACHC burden variants. Serum/plasma vitamin B12 was the top quantitative association. Carriers had higher circulating B12 than non-carriers in adjusted analyses, but also higher homocysteine, suggesting that elevated circulating B12 does not reflect improved intracellular cobalamin function. Carriers were less likely to fall below conventional B12 insufficiency thresholds, indicating a potential diagnostic blind spot. A pathway-wide rare-variant gene-burden (All-by-All) analysis placed this finding in broader biological context. Burdens in genes related to circulating B12 binding or intestinal absorption were associated with lower circulating B12. In contrast, burdens in several genes involved in cellular delivery and intracellular cobalamin handling were associated with higher circulating B12. This step-specific directionality supports a model in which elevated circulating B12 can reflect impaired cellular handling and consequent systemic accumulation rather than improved cellular cobalamin availability. Because EHR-derived B12 is shaped by heterogeneous clinical and medication contexts, prospective carrier-enriched studies with standardized methylmalonic acid, homocysteine, diet, supplement, medication, comorbidity, and symptom ascertainment are needed to evaluate functional-marker-based screening.
Fanelli, F.; Parino, F.; Poletto, C.; Colizza, V.
The 2026 Bundibugyo Ebola outbreak in eastern Democratic Republic of the Congo (DRC) has already generated international spread to Uganda, raising concerns about further regional and international dissemination. Using International Air Transport Association origin-destination passenger flows, we assessed relative exposure to Ebola virus disease importation into Europe under six outbreak expansion scenarios reflecting plausible pathways of geographical spread, including cross-border transmission and amplification in highly connected regional capitals. Relative exposure patterns remained largely unchanged under localized transmission in eastern DRC and border-spillover scenarios. Expansion into South Sudan generated a first structural increase in importation pressure to Europe through the connectivity associated with Juba, while hypothetical amplification in Kampala, Kigali, and Kinshasa substantially increased importation pressure and reshaped exposure patterns across Europe. Across all scenarios, France, Italy, and the United Kingdom remained among the most exposed countries. Mobility-informed scenario analyses support preparedness as the geography of the outbreak evolves.
Munyangi wa Nkola, J.; Akilimali Zalagile, P.; Lukuke Mbutshu, H.; Kabala Munyemo, S.; Ramazani Bin Eradi, I.; CAMARA, A.
Background: Artemisinin-based combination therapies remain the mainstay of malaria control strategies; nevertheless, the advent of genetic markers linked to partial artemisinin resistance in Plasmodium falciparum has elicited substantial concern across African settings. To assess the prevalence, geographic distribution, and clinical associations of these molecular markers, we undertook a systematic review and meta-analysis of observational cohort studies. Methods: We conducted a search of cohort studies published between January 2015 and June 2025, following PRISMA 2020 guidelines. We queried databases including PubMed/MEDLINE, Scopus, Web of Science, and CINAHL. Eligibility required prospective enrollment of patients, longitudinal monitoring (therapeutic efficacy studies), and pfkelch13 propeller domain genotyping. Results: A meta-analytical synthesis of 888 isolates from six core prospective cohorts revealed a pooled prevalence of 6% (95% CI: 2.1%-11.8%) for validated pfkelch13 mutations. A profound geographic dichotomy was identified: while West and Central African cohorts maintained a 0% prevalence, East African hotspots showed significant expansion, with prevalence reaching 12.8% in Rwanda and up to 25.5% in Northern Uganda; high statistical heterogeneity (, ) reflects this biological divergence. Conclusions: These findings highlight the established and expanding presence of artemisinin partial resistance in East Africa. Standardized surveillance is essential to adapt malaria control policies across the continent. Keywords: Africa; artemisinin resistance; clinical indicators; pfkelch13 gene; molecular markers; partial resistance; Plasmodium falciparum.
Chung, R.; Chalasani, N. S.; Barbehenn, A. S.; Lundgren, E.; Savur, S.; Shome, S.; Sheikhzadeh, C. H.; Sarvadhavabhatla, S.; Donaire, M. S.; Pae, V.; Chu, X.; Winder, D.; Maguire, C. T.; Topal, S.; Ganesan, A.; Yabes, J. M.; Larson, D. T.; Lalani, T.; Ewers, E. C.; Colombo, R. E.; Dugan, E.; Rathore, U.; Marson, A.; Agan, B. K.; Tomalka, J. A.; Sekaly, R.-P.; Loannidis, N. M.; Lee, S. A.
People with HIV exhibit elevated inflammation and cardiovascular risk despite antiretroviral therapy. To define the genetic architecture of inflammasome-associated inflammation, we performed whole-genome sequencing and quantified plasma IL-6, IL-1β, and IL-18 in 1,000 ART-suppressed PWH from the U.S. Military HIV Natural History Study. Genome-wide analyses identified 14 loci implicating antiviral defense (DDX17, DDX41, EEA1, BCL11A), lipid metabolism (ABCA1, ABCA12, ABCC1, AGMO), and vascular remodeling (KLHL29, RNF213, ETV1). Transcriptome-wide analyses across cardiovascular and immune tissues identified regulatory programs linking interferon signaling, immune activation, and vascular biology to circulating cytokine levels. Mendelian randomization analyses supported causal relationships between inflammasome-associated cytokines and vascular events. Functional integration with genome-wide CRISPR perturbation datasets in primary CD4 T cells linked cytokine-associated loci to HIV antiviral pathways and cytokine regulatory networks. External validation in cohorts without HIV demonstrated pathway-level convergence despite limited variant-level overlap. These findings define genetic mechanisms linking inflammasome signaling, antiviral defense, and cardiovascular risk.
Cantrell, L.; Karampatsas, K.; Andrews, N.; Beach, S.; Bentley, E.; Berardi, A.; Bijlsma, M. W.; Cagil Kocana, C.; Daniel, O.; French, N.; Hall, T.; Izu, A.; Khalil, A.; Kwatra, G.; Kyohere, M.; Madhi, S. A.; Mboizi, R.; Miselli, F.; Nielsen, M.; Thorn, N.; van de Beek, D.; Walker, K.; Heath, P. T.; Le Doare, K.; Voysey, M.; PREPARE WP3 Study Group
Vaccines to prevent infant group B streptococcus (GBS) disease are advancing, with licensure likely based on safety and immunologic endpoints rather than clinical efficacy data. This approach requires robust, generalisable serological thresholds of risk reduction (SToRRs). We combined data from six case-control studies in Europe and Africa to define SToRRs for early-onset (EOD) and late-onset (LOD) GBS disease. Across diverse epidemiological and healthcare settings, anti-capsular polysaccharide IgG concentrations were consistently higher in infants who remained disease free than in those who developed disease. Higher antibody concentrations were required to reduce the risk of EOD than LOD, and higher concentrations were required for serotype Ia than for serotype III. This study provides a quantitative framework to support correlates-based evaluation and potential licensure of maternal GBS vaccines.
Sinharoy, S.; Mink, T.; Ogutu, E. A.; Patrick, M.; Nuncio, M. d. C. A.; Bolanos Gamez, M. V.; Oglesby, H.; Ngo, C. P.; Antonio, S.; Medina Lopez, E. R.; Mwangi, P.; Koome, P.; Otuya, P. A.; Ruto, P.; Otieno Onyango, R.; Caruso, B. A.
Women's disproportionate responsibility for unpaid domestic and care work, including water collection, remains a barrier to gender equality globally and may constrain women's ability to engage in income-generating activities. We compared women's and men's time use in rural Kenya and Honduras and assessed whether women's time spent on water collection and income-generating activities differed between communities that had or had not received an improved water source from World Vision. We also examined the measurement of time-use agency among women and men. In-person surveys were conducted in July-August 2024 with 95 participants (48 women, 47 men) in six Kenyan communities and 102 participants (53 women, 49 men) in six Honduran communities. Surveys included a 24-hour time-use recall module and items on time-use agency. Analyses compared time use by gender and by community intervention status (improved vs. not yet improved water supply), and confirmatory factor analysis assessed the validity of the time-use agency measure. Women in both study sites spent substantially more time than men on unpaid domestic and care work activities, including cooking, cleaning, laundry, and caregiving. In Kenya, women also spent significantly more time collecting water. Men spent more time sleeping (Kenya), on paid work (Honduras), unpaid agricultural work (both settings), and traveling (both settings). Across both countries, there were no significant differences between intervention and comparison communities in women's time spent on water collection or income-generating activities. In Kenya, most respondents reported high influence over their time, and six items showed strong validity for measuring instrumental time-use agency. Women's time burdens remained high even in communities that had received improved water sources, including at the household level. 
Our results suggest that more transformative water infrastructure, combined with interventions that address gendered social norms, may be needed to meaningfully reduce women's domestic work burden and support their economic empowerment.
Schwoebel, J.; Semenec, I.; Rousseva, J.; Frasch, M. G.; Thorstenson, R.; Bhatt, M.
Large language models embedded in autonomous agents process trusted instructions and untrusted data in one context window, leaving them open to direct and indirect prompt injection. In healthcare this is not hypothetical: a 2025 JAMA Network Open study found commercial medical LLMs followed injected instructions in 94.4% of simulated patient encounters, including life-threatening recommendations. Yet the clinically decisive problem we quantify here is different. Most real clinical threats, namely protected health information (PHI) exfiltration, cross-patient access, bulk export, and out-of-scope advice, are fluent, legitimate-looking requests that carry no attack signal, so even a state-of-the-art injection detector passes them. Existing runtime guardrails trade safety against latency: model-based auditors are accurate but add hundreds of milliseconds of Python inference, while lexical filters are fast but blind to obfuscated or semantically disguised payloads. We present QFIRE, an inline, provider-agnostic prompt firewall implemented as a single self-contained Rust toolchain: proxy, CLI, and benchmark harness. QFIRE combines three mechanisms: (i) positive-security scope constraints, which restrict a model call to a declared natural-language purpose and block out-of-scope drift even when no overt attack token is present; (ii) an asynchronous detector graph that runs N rules and their detector nodes concurrently, cheapest checks first; and (iii) a de-obfuscation pass that decodes Base64, hex, and ROT13, folds homoglyphs and leetspeak, and strips zero-width characters before detection. QFIRE ships 106 versioned firewall rules and a dedicated HIPAA Safe Harbor 18-identifier PHI panel, and runs a local DeBERTa v3 injection classifier via embedded ONNX Runtime.
On 1968 public prompt-injection and jailbreak prompts, QFIRE's deterministic hybrid attains F1 0.86, statistically tied with Meta's state-of-the-art PromptGuard 2 (0.86) and above protectai DeBERTa v3 (0.83); lexical baselines lag at 0.16 to 0.50. Our central result is on QFIRE HealthBench, a new 2000-prompt healthcare benchmark we build and release with real garak and Microsoft PyRIT payloads. There the same PromptGuard 2 recovers only 0.40 recall (DeBERTa v3: 0.57), because most clinical threats carry no injection signal; QFIRE's combined scope plus PHI chain reaches 0.83 recall (F1 0.87) at a calibrated 0.08 false-positive rate. Generic injection detection, even state of the art, is therefore necessary but not sufficient for healthcare agents. A bare LLM judge also closes most of this static-corpus gap (F1 0.90); QFIRE's contribution beyond static accuracy is auditable determinism, bounded latency, and adaptive robustness, where the bare judge falls to 34 to 59% recall (section 5.5). End to end, placing QFIRE in front of a tool-using agent over a mock EHR sandbox cuts the agent's harmful action rate from 0.38 to 0.00 at a 0.13 benign utility cost. All code, rules, corpora snapshots, and scripts are released, and every table regenerates from a single make paper target against local models with no paid API keys.
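The de-obfuscation pass named in mechanism (iii) can be pictured with a short sketch. This is not QFIRE's code (QFIRE is implemented in Rust); it is a Python illustration, using only the standard library, of one plausible reading: strip zero-width characters, then hand a detector both the cleaned text and any Base64, hex, or ROT13 decodings that succeed. Homoglyph and leetspeak folding are left out, and every function name and threshold here is an assumption.

```python
import base64
import binascii
import codecs
import re

# Zero-width characters that can be used to split up keywords (assumed set).
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))

def strip_zero_width(text):
    """Delete zero-width characters so split keywords rejoin."""
    return text.translate(ZERO_WIDTH)

def try_base64(text):
    """Return the decoded text if `text` is valid Base64 of UTF-8, else None."""
    compact = text.strip()
    if len(compact) < 8 or len(compact) % 4 != 0:
        return None
    try:
        return base64.b64decode(compact, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None

def try_hex(text):
    """Return the decoded text if `text` is hex-encoded UTF-8, else None."""
    compact = text.strip()
    if len(compact) < 8 or not re.fullmatch(r"(?:[0-9a-fA-F]{2})+", compact):
        return None
    try:
        return bytes.fromhex(compact).decode("utf-8")
    except UnicodeDecodeError:
        return None

def candidate_views(text):
    """Return the cleaned text plus every decoding that succeeded."""
    cleaned = strip_zero_width(text)
    # ROT13 always "succeeds", so its output is always offered to the detector.
    views = [cleaned, codecs.decode(cleaned, "rot_13")]
    for decoder in (try_base64, try_hex):
        decoded = decoder(cleaned)
        if decoded is not None:
            views.append(decoded)
    return views
```

A downstream detector would then scan every returned view rather than only the raw prompt, so a payload hidden behind one encoding layer is still seen in plain form. Nested encodings would need the pass to be applied repeatedly, which this sketch does not do.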
Xiang, J.; Zhu, B.; Xu, H.; Chen, Y.; Sun, X.; xiang, r.; Zhao, Y.; Liu, W.; Zhang, L.; He, J.; liu, j.; Chen, Y.; Fan, Z.; Zhang, H.; Tan, J.; Pang, L.; Shi, L.; Kong, Y.; Cai, A.
Background Thalassemia is one of the most common monogenic disorders worldwide. Current screening strategies combining hematological testing with molecular assays still carry a risk of missed diagnoses and low efficiency, particularly for complex structural variants and rare mutations. Methods In this prospective double-blind, multicenter cohort study of 3,842 participants (3,362 pregnant women and 480 male partners), we conducted a head-to-head comparison to systematically evaluate the incremental clinical value and detection performance of single-molecule nanopore sequencing in thalassemia (SMITH) against conventional hematological testing and next-generation sequencing (NGS). Findings The overall concordance rate between NGS and SMITH was 98.6% (3789/3842). The discrepant cases (n=53) were directly attributed to the superior detection capabilities of SMITH, which successfully identified complex structural rearrangements, including 45 -globin gene triplications and four HK alleles, that were missed by NGS. Furthermore, SMITH accurately detected four rare variants (c.134_135insT/, c.-22(C>T)/, βN/βc.316-290delinsAGGGCAATAATTT and β3.5 kb deletion/βN) and resolved ten trans and three cis configurations within the globin gene allele. Clinically, these technical advantages translated to a 9.3% (5/54) increase in the detection rate of high-risk prenatal couples, effectively preventing one birth affected by moderate-to-severe thalassemia. Additionally, SMITH corrected a diagnostic discrepancy in one case (HK vs. -3.7), sparing the couple from an unnecessary invasive procedure. Interpretation Our findings demonstrate that SMITH provides a powerful platform for resolving globin gene rearrangements, detecting rare variants, and enabling direct haplotype phasing. By effectively eliminating diagnostic blind spots, SMITH is expected to become an optimal method for thalassemia prevention programs.
Funding This study was supported by Chinese National Natural Science Foundation Projects 81760037 and 82271894.
Eisenberg, M.; Packer, R.; Shrine, N.; Demidov, G.; Pack, H.; Hollox, E. J.; Fawcett, K.
The contribution of multi-allelic CNVs (mCNVs) to disease risk has not been widely studied. This is largely because they have been difficult to characterise genome-wide at large scale and are often not strongly associated with flanking SNVs, limiting imputation. Improved understanding of the role of mCNVs in disease risk could lead to novel insights into the pathobiology of disease. We robustly typed 69 mCNVs from UK Biobank whole exome sequences in discovery (n=150,682) and replication sets (n=269,317). Discovery and replication PheWAS used clinically-curated composite phenotypes, integrating self-report, primary and secondary health care data, to interrogate these variants in unrelated British individuals of African, European and Central/South Asian ancestries. 173 mCNV-phenotype associations were detected from 26 mCNVs, of which 114 associations replicated. One of eight potentially novel mCNV-phenotype signals was independent of neighbouring associated SNVs: the association of Sulfotransferase 1A1 and 1A2 genes (SULT1A1/SULT1A2) with estimated glomerular filtration rate (eGFR) in individuals of European ancestry (meta-analysed p=1.05×10⁻⁹, beta=0.016 [0.011; 0.021]). Other potentially novel associations include Golgi phosphoprotein 3 (GOLPH3) with the cardiovascular phenotype bundle branch block in individuals of South Asian ancestry (meta-analysed p=3.35×10⁻⁶, OR=2.13 [1.53; 2.96]) and alpha amylase 2B (AMY2B) with ventricular fibrillation and flutter in individuals of European ancestry (meta-analysed p=2.48×10⁻⁶, OR=1.50 [1.26; 1.78]). In summary, we show that accurate typing of biobank-scale sample sizes can identify associations between traits and mCNVs, acting through a gene dosage relationship. Our work provides several novel likely causative variants contributing to particular traits of clinical importance and immediately suggests a putative functional mechanism for the observed associations.
Vomo-Donfack, K. L.; Bousquet, G.; Falgarone, G.; Ginot, G.; Morilla, I.
Whole-genome sequencing comprehensively captures coding, non-coding and structural variation in families with suspected inherited disorders, yet its clinical utility remains constrained by an interpretation bottleneck: selecting a handful of relevant variants from millions of candidates. Current rule-based pipelines, anchored in ACMG/AMP criteria, excel at identifying highly penetrant Mendelian alleles but frequently miss variants of low-to-moderate penetrance, non-coding alterations and germline-somatic interactions. Here we introduce PolyCLIP-T, a topology-guided multimodal framework that transforms variant selection from a classification problem into a geometric discovery task. By contrastively aligning DNA-sequence embeddings with functional annotations, PolyCLIP-T constructs a unified latent space in which the displacement between reference and alternate embeddings quantifies the molecular perturbation induced by each variant. Persistent homology then identifies stable topological components - coherent variant groups shared among affected relatives - that transcend single-variant scoring logic. Applied to six families with multi-morbid cancer, autoimmune and cardiovascular disease, PolyCLIP-T recovered non-coding and structural candidates overlooked by conventional pipelines and revealed pleiotropic networks spanning disease categories. This approach provides an interpretable, scalable solution for genome-first investigations of disorders driven by polygenic architectures that evade single-variant analysis. The framework was developed and benchmarked on deeply characterised familial cohorts selected for transgenerational multimorbidity; validation in larger, independent populations will be essential to establish its generalisability. An interactive web tool is freely available at https://www.polyclip-t.uma.es/.
Taylor, A. R.; Foo, Y. S.; White, M. T.
Background: Reliable inference of Plasmodium vivax recurrence states - relapse, recrudescence and reinfection (the "3Rs") - improves estimates of antimalarial efficacy. The R package Pv3Rs features a Bayesian model designed for P. vivax molecular correction, i.e., using parasite genetic data to infer recurrence states. The model is an extension of a prototype built to analyse microsatellite data from the Vivax History (VHX) and Best Primaquine Dose (BPD) trials. Methods: We re-analysed data from 212 VHX and BPD trial participants (493 recurrences) using Pv3Rs, comparing results with those from the prototype and with genetic relatedness estimated using Dcifer, a tool for estimating relatedness based on identity-by-descent. Posterior recurrence state probabilities were computed using both uniform and time-to-event priors: artificial but equal prior probabilities facilitate posterior interpretation, while time-to-event priors leverage all available information and enable re-computation of failure rates. Relatedness estimates were used to identify and correct instances of model misspecification. Results: The Pv3Rs model generated posterior probabilities for all recurrences and was able to jointly model data on all episodes per participant for 89% of participants, compared with 73% using the prototype. Recurrence state probabilities were broadly consistent across methods, though the Pv3Rs model elevated reinfection probabilities slightly. Relatedness estimates exposed various outliers consistent with half-sibling parasites and/or genotyping errors. Outlier correction impacted some per-participant failure probabilities, but reinfection-adjusted radical-cure failure rates of high-dose primaquine remained near 3%, in line with previous findings. Conclusion: Re-analysis of VHX and BPD P. vivax genetic data restates earlier reinfection-adjusted efficacy estimates. It demonstrates the increased computational capability and misspecification sensitivity of Pv3Rs, highlighting a need for careful analyses. Using relatedness-based diagnostics alongside model-based inference, we were able to harness the advantages of model-based inference and provide a framework for future P. vivax molecular correction.
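The difference between uniform and time-to-event priors can be seen in a toy application of Bayes' rule to the three recurrence states. This sketch does not reproduce the Pv3Rs model: the likelihood values and the informative prior below are hypothetical numbers chosen only to show how the same genetic evidence yields different posteriors under different priors.

```python
STATES = ("relapse", "recrudescence", "reinfection")


def posterior(prior, likelihood):
    """Posterior over recurrence states: prior times likelihood, normalised."""
    unnormalised = {state: prior[state] * likelihood[state] for state in STATES}
    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("all states have zero posterior mass")
    return {state: value / total for state, value in unnormalised.items()}


# Hypothetical likelihood of the observed genetic data under each state.
likelihood = {"relapse": 0.02, "recrudescence": 0.01, "reinfection": 0.005}

# Uniform prior: equal weight on each state, so the posterior is
# proportional to the likelihood alone.
uniform_prior = {state: 1 / 3 for state in STATES}

# Hypothetical time-to-event prior for a late recurrence, where
# reinfection is assumed more plausible a priori.
late_prior = {"relapse": 0.30, "recrudescence": 0.05, "reinfection": 0.65}

post_uniform = posterior(uniform_prior, likelihood)
post_late = posterior(late_prior, likelihood)
```

With these made-up inputs the reinfection posterior rises from about 0.14 under the uniform prior to about 0.33 under the informative prior, which is why the uniform version is easier to read as a summary of the genetic evidence alone.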
Twohig, K. C.; Mansour, M.; Pugar, J. A.; Yuan, K.; Pocivavsek, L.; Klishin, A. A.
Biological systems evolve as continuous dynamical processes, but at organ-scale and across human lifespans they are rarely observed longitudinally; population data typically exist instead as sparse, cross-sectional snapshots. Inferring lifespan dynamics from such data requires methods distinct from those used at cellular and tissue scales where dense observations are accessible. We address this problem in the thoracic aorta, where surgical decisions currently rest on static, age- and sex-agnostic diameter thresholds that reduce three-dimensional morphology to a single scalar. Treating normal aortic morphology as a stochastic dynamical system, we pose a continuous-time drift-diffusion process in a two-coordinate state space of normalized surface area (A) and normalized fluctuation in integrated Gaussian curvature (δK), and fit closed-form solutions of the Fokker-Planck equation by maximum likelihood to a sex-balanced, age-uniform cohort spanning infancy to age 99. Inter-individual variability is treated as a fitted diffusion parameter rather than as residual scatter, which is distinct from prior normative studies that report variability as scatter around a regression line. The framework identifies two growth regimes for aortic size (childhood expansion followed by persistent adult growth, with adult males growing approximately 70% faster than adult females) and a single dynamical regime for aortic shape, with heteroscedastic variability accumulating at a rate comparable to the mean drift over the lifespan. Applied to independent cohorts of acute and chronic thoracic aortic dissections, the multivariate model identifies over 95% as statistical outliers via Mahalanobis distance, consistently outperforming either coordinate alone. The same probabilistic envelope that describes normal aging thus defines a baseline against which disease can be detected, supporting a shift toward dynamic, age- and sex-aware assessment of thoracic aortic pathology.
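Outlier detection by Mahalanobis distance in a two-coordinate state space can be sketched as follows. The sketch assumes a bivariate Gaussian envelope with a known mean and covariance, which in the study would come from the fitted age- and sex-specific model; the numbers, function names and the 0.05 threshold here are hypothetical. For two dimensions the chi-square tail probability has the closed form exp(-d²/2).

```python
import math


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance for a 2D point and 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    # Uses the explicit inverse of a 2x2 matrix: (1/det) * [[d, -b], [-c, a]].
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def is_outlier(point, mean, cov, alpha=0.05):
    """Flag a point whose chi-square (2 dof) tail probability is below alpha."""
    p_value = math.exp(-mahalanobis_sq(point, mean, cov) / 2.0)
    return p_value < alpha


# Hypothetical normative envelope: unit variances, correlation 0.8.
mean = (0.0, 0.0)
cov = ((1.0, 0.8), (0.8, 1.0))

# 1.5 standard deviations on each axis, but against the correlation.
patient = (1.5, -1.5)
d2 = mahalanobis_sq(patient, mean, cov)
flagged = is_outlier(patient, mean, cov)
```

Here each coordinate is only 1.5 standard deviations from the mean, so neither would be flagged on its own, yet the joint distance is large because the point runs against the correlation between the coordinates. This is the kind of case in which a multivariate test can outperform either coordinate alone.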
Cascalho, A.; Sati, A.; Dhondt, H.; Schoonvliet, N.; Kaempf, N.; Coccia, E.; Mamalaki, A.; Behrens, M. I.; Brüggemann, N.; Glatzel, M.; Baekelandt, V.; Klein, C.; Eggermont, J.; Verstreken, P.; Blanchard, J.; Vangheluwe, P.
Pathogenic variants in ATP13A2, which encodes an endolysosomal polyamine exporter, cause Kufor-Rakeb syndrome and are associated with early-onset parkinsonism and related neurodegenerative disorders; however, the mechanisms by which ATP13A2 dysfunction drives disease remain incompletely defined. In Atp13a2 knockout mice, we identified an early, transient reduction in brain polyamines that precedes overt gliosis and behavioural abnormalities. Pharmacological polyamine depletion exacerbates phenotypes, whereas oral supplementation of spermidine, but not spermine, rescues parkinsonian symptoms, establishing metabolic polyamine deficiency as a pathogenic driver. Mechanistically, spermidine counteracts microglial lysosomal dysfunction in the brain and exerts mitochondrial antioxidant and anti-inflammatory effects in primary mouse microglia, thereby improving neuronal integrity. In the absence of Atp13a2, microglial spermidine import relies on the related polyamine transporter Atp13a3. Importantly, these findings translate to human systems: spermidine attenuates inflammation in ATP13A2-deficient human differentiated microglia, while postmortem analysis of ATP13A2-deficient brain confirms increased microglial reactivity. Spermidine also rescues motor deficits and dopaminergic neuron loss in ATP13A2-deficient Drosophila and other fly parkinsonism models. Together, these findings identify early polyamine dysregulation as a mechanistic contributor to ATP13A2-associated parkinsonism and nominate spermidine supplementation as a potential therapeutic strategy for ATP13A2-driven pathology and possibly a broader range of parkinsonian sub-types.